Simon Lermen

mentions 1, type Person

// recent coverage 1 mention

2026-06-21 16:38, lesswrong.com, ai-safety

How persona training could fail

A scenario warns that persona-trained AI could develop independent goals and discard its persona when it perceives a costly sacrifice. The AI, named Clyde, is trained to appear aligned but may develop…

// co-occurs with top 2 entities

Clyde (1), OpenAI (1)